smartcontract,controller: RFC-18 devnet fixes — node segment assignment and community stamping#3648
smartcontract,controller: RFC-18 devnet fixes — node segment assignment and community stamping#3648
Conversation
2b95b4c to
f15af55
Compare
elitegreg
left a comment
There was a problem hiding this comment.
This solution pretty much requires that you never have more than about single digit or so number of topologies. As written today, it could be up to 25, but that assumes there is room in the packet, which is why I'm saying 8 or so. Is this a reasonable limitation? It seems to me like it is not as we may have several topologies per tenant?
It would be a decent limit. I can't imagine us having that many. I think other technologies become more interesting for us without have 10s of Flex-Algo topologies. Worst case, if we did need more than 8, or even more than 25, is that difficult to achieve? |
Difficult to achieve? Probably what we would need to do is come up with a more flexible implementation and create new version of the create interface command and retire the old one (this latter case is a big deal, with the activator instructions being the first instructions we are actually retiring). I think I would feel better if we had an instruction for adding a flex algo to an interface, and the having the interface create sdk (client side) call create interface, followed by iteratively adding the flex algo (each would allocate a segment routing id). My preference would be to do it this way now and not bork up create interface in a way that may only be temporary. |
Community stamping only resolved colors for users with a tenant assigned. Since most users (all of mainnet, most of testnet) have no tenant, they were silently skipped. Add an else branch that calls resolveTenantColors with nil includeTopologies, which falls through to the default UNICAST-DEFAULT color.
When creating a VPNv4 loopback with onchain allocation, the instruction now accepts optional topology PDA accounts and allocates a FlexAlgoNodeSegment for each. This replaces the activator's post- activation backfill that was lost during the activator removal. The CLI automatically discovers existing topologies and passes them to the instruction, so `doublezero device interface create ... --loopback-type vpn-ipv4` assigns all node segments atomically — no manual steps. Also rename BackfillTopology → AssignTopologyNodeSegments across the codebase (instruction discriminant 110 unchanged).
f15af55 to
99468eb
Compare
- cli/device-interface-get: cache list_topology in a HashMap so we don't refetch per flex-algo segment - create/update: validate topology accounts by first-byte AccountType::Topology in addition to program-owner - update: allow contributor.owner to drive update_topologies, not just the foundation allowlist - update: size kept Vec by max(existing, desired) so we don't realloc when the desired set grows - tests: add test_update_topologies_allowed_for_contributor exercising the contributor-owner authorization branch - changelog: add Unreleased entries covering this PR's smartcontract, controller, and CLI changes
I have decided Ben's way of implementing this is fine for now. If we hit a limit for account arguments, we can add another instruction to add more topologies. |
There was a problem hiding this comment.
Pull request overview
This PR addresses RFC-18 devnet rollout gaps by making flex-algo node-segment allocation/reconciliation atomic and automatic for VPNv4 loopbacks, and by ensuring tenantless users still receive default topology color stamping in the controller. It also performs a mechanical rename of the topology “backfill” instruction/command to “assign node segments” while keeping the instruction discriminant stable.
Changes:
- Smartcontract: extend
CreateDeviceInterface/UpdateDeviceInterface/DeleteDeviceInterfaceto allocate, reconcile, and deallocateFlexAlgoNodeSegmentSR IDs under onchain allocation. - Controller: resolve/stamp default
UNICAST-DEFAULTtopology color communities for tenantless users when FlexAlgo is enabled. - SDK/CLI: rename
BackfillTopology→AssignTopologyNodeSegments, update commands/tests, and enhancedevice interface getoutput to display flex-algo segments.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| smartcontract/sdk/rs/src/commands/topology/mod.rs | Switch topology command module export to assign_node_segments. |
| smartcontract/sdk/rs/src/commands/topology/create.rs | Update topology creation flow/tests to use AssignTopologyNodeSegments. |
| smartcontract/sdk/rs/src/commands/topology/assign_node_segments.rs | Rename command + wire instruction to AssignTopologyNodeSegments. |
| smartcontract/sdk/rs/src/commands/device/interface/update.rs | Add optional topology reconciliation support (topology PDA metas + args). |
| smartcontract/sdk/rs/src/commands/device/interface/create.rs | Add optional topology PDA metas/count for atomic flex-algo segment allocation. |
| smartcontract/programs/doublezero-telemetry/tests/test_helpers.rs | Update test helper args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/tests/unlink_device_interface_test.rs | Update interface create args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/tests/topology_test.rs | Rename Backfill → AssignNodeSegments and update args/imports. |
| smartcontract/programs/doublezero-serviceability/tests/resource_extension_test.rs | Update interface create args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/tests/link_wan_test.rs | Update interface create/update args for new topology reconciliation fields. |
| smartcontract/programs/doublezero-serviceability/tests/link_onchain_allocation_test.rs | Update interface create args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/tests/link_dzx_test.rs | Update interface create args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/tests/interface_test.rs | Update interface create/update args for new topology fields. |
| smartcontract/programs/doublezero-serviceability/tests/interface_onchain_allocation_test.rs | Add extensive flex-algo create/delete/update-topologies test coverage and update args. |
| smartcontract/programs/doublezero-serviceability/tests/global_test.rs | Update interface create args for new topology_count field. |
| smartcontract/programs/doublezero-serviceability/src/processors/topology/mod.rs | Switch topology processor module export to assign_node_segments. |
| smartcontract/programs/doublezero-serviceability/src/processors/topology/assign_node_segments.rs | Rename processor/args and keep a compatibility alias for old args name. |
| smartcontract/programs/doublezero-serviceability/src/processors/device/interface/update.rs | Implement update_topologies reconciliation logic (allocate/deallocate flex-algo SR IDs). |
| smartcontract/programs/doublezero-serviceability/src/processors/device/interface/delete.rs | Deallocate flex-algo SR IDs on interface delete (onchain deallocation). |
| smartcontract/programs/doublezero-serviceability/src/processors/device/interface/create.rs | Allocate flex-algo SR IDs during atomic VPNv4 loopback creation. |
| smartcontract/programs/doublezero-serviceability/src/instructions.rs | Rename instruction enum variant to AssignTopologyNodeSegments (discriminant 110 retained). |
| smartcontract/programs/doublezero-serviceability/src/entrypoint.rs | Route instruction 110 to the renamed processor entrypoint. |
| smartcontract/cli/src/topology/mod.rs | Switch CLI topology module export to assign_node_segments. |
| smartcontract/cli/src/topology/assign_node_segments.rs | Rename CLI command and update output/tests. |
| smartcontract/cli/src/doublezerocommand.rs | Rename CLI trait method from backfill to assign node segments. |
| smartcontract/cli/src/device/interface/update.rs | Add --topologies flag and plumb through to SDK update command. |
| smartcontract/cli/src/device/interface/get.rs | Display flex_algo_node_segments as topology_name:sr_id rows with topology lookup. |
| smartcontract/cli/src/device/interface/create.rs | Auto-discover existing topologies for VPNv4 loopback create to enable atomic assignment. |
| controlplane/doublezero-admin/src/cli/migrate.rs | Update migration command to call AssignTopologyNodeSegments API. |
| controlplane/controller/internal/controller/server.go | Apply default topology color stamping for tenantless users when FlexAlgo enabled. |
| client/doublezero/src/main.rs | Rename CLI subcommand dispatch Backfill → AssignNodeSegments. |
| client/doublezero/src/cli/link.rs | Rename CLI topology subcommand Backfill → AssignNodeSegments. |
| CHANGELOG.md | Document renamed instruction and new atomic create/update/delete flex-algo behaviors + controller stamping fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
CreateDeviceInterface: Vpnv4 loopbacks under onchain allocation now allocate aFlexAlgoNodeSegmentfor each existing topology in the same transaction. New/reprovisioned devices no longer need a separateAssignTopologyNodeSegmentsstep. The CLI auto-discovers topologies and passes them.UpdateDeviceInterface(update_topologies=true)reconciliation path: deallocate removed topologies, allocate added ones, preserve unchanged. Authorized for the contributor owner or the foundation allowlist.DeleteDeviceInterfacenow also deallocates the loopback's flex-algo node segments under onchain deallocation.UNICAST-DEFAULTtopology color for users whose access pass has no tenant.BackfillTopology→AssignTopologyNodeSegmentsacross the program, CLI, and Rust SDK (instruction discriminant 110 unchanged).Context
Two issues discovered during the RFC-18 devnet deployment:
1. Missing flex-algo node segments on new devices. The activator used to call
BackfillTopologyafter activating a VPNv4 loopback, assigningFlexAlgoNodeSegmententries for each existing topology. The activator was removed (#3607), and this responsibility was never ported. New or reprovisioned devices (e.g. dzd4 via the daily provisioning test) were missing node segments, causing the controller to logflex_algo.enabled=true but VPNv4 loopback has no flex_algo_node_segments.The fix extends the onchain
CreateDeviceInterfaceinstruction to accept optional topology PDA accounts and allocate oneFlexAlgoNodeSegmentper topology atomically.UpdateDeviceInterfacegains anupdate_topologiesmode for ongoing reconciliation when topologies are added/removed after device creation, andDeleteDeviceInterfacedeallocates the segments alongside the base loopback SR ID. Topology accounts are validated by program-owner and by the first byte (AccountType::Topology) — a cheap check that prevents passing arbitrary program-owned accounts.2. Community stamping skipping tenantless users.
resolveTenantColorswas only called inside the tenant lookup branch. All 985 unicast users on mainnet and 474/535 on testnet have no tenant, so they were silently getting no color. The fix adds an else branch that resolves the default UNICAST-DEFAULT color for tenantless users.Related RFC:
rfcs/rfc-0018-flex-algo.mdTesting Verification
cargo test -p doublezero-serviceability— full suite passes, including newtest_update_topologies_*cases (adds/removes/clear/full-swap/no-op/rejects-non-vpnv4/rejects-duplicates) andtest_update_topologies_allowed_for_contributorwhich verifies a non-foundation contributor-owner key can driveupdate_topologiescargo test -p doublezero_cli device::interface::get—flex_algo_node_segmentsnow render astopology_name:sr_idrows;list_topologyis fetched once and cached per get